74 research outputs found
Self-Supervised Sketch-to-Image Synthesis
Imagining a colored realistic image from an arbitrarily drawn sketch is one
of the human capabilities that we are eager for machines to mimic. Unlike previous
methods that either require sketch-image pairs or utilize low-quality
detected edges as sketches, we study the exemplar-based sketch-to-image (s2i)
synthesis task in a self-supervised learning manner, eliminating the necessity
of the paired sketch data. To this end, we first propose an unsupervised method
to efficiently synthesize line-sketches for general RGB-only datasets. With the
synthetic paired-data, we then present a self-supervised Auto-Encoder (AE) to
decouple the content/style features from sketches and RGB-images, and
synthesize images that are both content-faithful to the sketches and
style-consistent to the RGB-images. While prior works employ either the
cycle-consistency loss or dedicated attentional modules to enforce the
content/style fidelity, we show the AE's superior performance with pure
self-supervision. To further improve the synthesis quality in high resolution,
we also leverage an adversarial network to refine the details of synthetic
images. Extensive experiments at 1024×1024 resolution demonstrate new
state-of-the-art performance of the proposed model on the CelebA-HQ and Wiki-Art
datasets. Moreover, with the proposed sketch generator, the model shows a
promising performance on style mixing and style transfer, which require
synthesized images to be both style-consistent and semantically meaningful. Our
code is available at
https://github.com/odegeasslbc/Self-Supervised-Sketch-to-Image-Synthesis-PyTorch,
and please visit https://create.playform.io/my-projects?mode=sketch for an
online demo of our model.
Comment: AAAI-202
BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning
An ever-increasing number of configuration parameters are provided to system
users. But many users have used one configuration setting across different
workloads, leaving untapped the performance potential of systems. A good
configuration setting can greatly improve the performance of a deployed system
under certain workloads. But with tens or hundreds of parameters, it becomes a
highly costly task to decide which configuration setting leads to the best
performance. While such a task requires strong expertise in both the system
and the application, users commonly lack such expertise.
To help users tap the performance potential of systems, we present
BestConfig, a system for automatically finding a best configuration setting
within a resource limit for a deployed system under a given application
workload. BestConfig is designed with an extensible architecture to automate
the configuration tuning for general systems. To tune system configurations
within a resource limit, we propose the divide-and-diverge sampling method and
the recursive bound-and-search algorithm. BestConfig can improve the throughput
of Tomcat by 75%, that of Cassandra by 63%, that of MySQL by 430%, and reduce
the running time of a Hive join job by about 50% and that of a Spark join job by
about 80%, solely by configuration adjustment.
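The abstract names divide-and-diverge sampling without describing it. As a rough illustration of the general idea of spreading a small sample budget evenly over a large parameter space, the sketch below uses Latin-hypercube-style sampling as a stand-in. This is an assumption, not BestConfig's actual algorithm, and the function names, parameter ranges, and seed are hypothetical.

```python
import random

def lhs_samples(bounds, k, seed=0):
    """Draw k configurations with Latin-hypercube-style sampling.

    Assumption: this is a stand-in for illustration, not BestConfig's
    divide-and-diverge method. Each parameter range is cut into k equal
    intervals, and every interval is used exactly once per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / k
        # One random point inside each of the k intervals.
        points = [low + (i + rng.random()) * width for i in range(k)]
        # Shuffle so interval order is decoupled across parameters.
        rng.shuffle(points)
        columns.append(points)
    # Transpose: one tuple per sampled configuration.
    return [tuple(col[j] for col in columns) for j in range(k)]

def interval_index(value, low, high, k):
    """Return which of the k equal intervals of [low, high] holds value."""
    return min(int((value - low) / ((high - low) / k)), k - 1)

# Hypothetical parameters: a cache size in MB and a thread count.
bounds = [(0.0, 100.0), (1.0, 9.0)]
samples = lhs_samples(bounds, k=4)
```

With a budget of four test runs, every quarter of each parameter's range is tried once, which is the kind of coverage a resource-limited tuner needs before it narrows the search.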
Diffusion Guided Domain Adaptation of Image Generators
Can a text-to-image diffusion model be used as a training objective for
adapting a GAN generator to another domain? In this paper, we show that
classifier-free guidance can be leveraged as a critic, enabling generators to
distill knowledge from large-scale text-to-image diffusion models. Generators
can be efficiently shifted into new domains indicated by text prompts without
access to ground-truth samples from target domains. We demonstrate the
effectiveness and controllability of our method through extensive experiments.
Although not trained to minimize CLIP loss, our model achieves equally high
CLIP scores and significantly lower FID than prior work on short prompts, and
outperforms the baseline qualitatively and quantitatively on long and
complicated prompts. To the best of our knowledge, the proposed method is the first
attempt at incorporating large-scale pre-trained diffusion models and
distillation sampling for text-driven image generator domain adaptation, and
it achieves a quality that was previously out of reach. Moreover, we extend our work to
3D-aware style-based generators and DreamBooth guidance.
Comment: Project website: https://styleganfusion.github.io
TIME: Text and Image Mutual-Translation Adversarial Networks
Focusing on text-to-image (T2I) generation, we propose Text and Image
Mutual-Translation Adversarial Networks (TIME), a lightweight but effective
model that jointly learns a T2I generator G and an image captioning
discriminator D under the Generative Adversarial Network framework. While
previous methods tackle the T2I problem as a uni-directional task and use
pre-trained language models to enforce the image-text consistency, TIME
requires neither extra modules nor pre-training. We show that the performance
of G can be boosted substantially by training it jointly with D as a language
model. Specifically, we adopt Transformers to model the cross-modal connections
between the image features and word embeddings, and design an annealing
conditional hinge loss that dynamically balances the adversarial learning. In
our experiments, TIME achieves state-of-the-art (SOTA) performance on the CUB
and MS-COCO datasets (Inception Score of 4.91 and Fréchet Inception Distance
of 14.3 on CUB), and shows promising performance on MS-COCO on image captioning
and downstream vision-language tasks.
Comment: AAAI-202
How Do Price and Quantity Promotions Affect Hedonic Purchases? An ERPs Study
Because hedonic products are not necessary for basic well-being, consumers need justifications for the pleasure of consuming them. However, different justifications, such as price promotions (PP) and quantity promotions (QP), differ in how strongly they promote hedonic purchases; the difference between the two is that QP requires purchasing additional units to get the same discount as PP. In the present study, event-related potentials (ERPs) were used to reveal the timing of brain activity and to further understand how promotion information on hedonic products, consisting of promotion type (PP or QP) and discount depth (deep or shallow discount, DD or SD), is processed. Behaviorally, consumers were more willing to purchase items in the PP and DD conditions than in the QP and SD conditions, respectively, and spent more time making final purchase decisions in the QP and DD condition or the PP and SD condition than in the PP and DD condition. Neurophysiologically, DD automatically recruited more attentional resources than SD and led to a higher P2 amplitude. The QP and DD condition or the PP and SD condition evoked a larger N2 amplitude and enhanced perceptual conflict compared to the PP and DD condition. During the late stage, PP and DD elicited a more positive LPP amplitude than QP and SD, respectively, indicating that people have stronger purchase intention and more positive affect in PP and DD contexts. These findings provide evidence for the differential influences of PP and QP and for what ultimately makes consumers buy hedonic products or not.
Improving Negative-Prompt Inversion via Proximal Guidance
DDIM inversion has revealed the remarkable potential of real image editing
within diffusion-based methods. However, the accuracy of DDIM reconstruction
degrades as larger classifier-free guidance (CFG) scales are used for
enhanced editing. Null-text inversion (NTI) optimizes null embeddings to align
the reconstruction and inversion trajectories with larger CFG scales, enabling
real image editing with cross-attention control. Negative-prompt inversion
(NPI) further offers a training-free closed-form solution of NTI. However, it
may introduce artifacts and is still constrained by DDIM reconstruction
quality. To overcome these limitations, we propose Proximal Negative-Prompt
Inversion (ProxNPI), extending the concepts of NTI and NPI. We enhance NPI with
a regularization term and reconstruction guidance, which reduces artifacts
while capitalizing on its training-free nature. Our method provides an
efficient and straightforward approach, effectively addressing real image
editing tasks with minimal computational overhead.
Comment: Code at https://github.com/phymhan/prompt-to-promp
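For readers unfamiliar with the CFG scale mentioned in this abstract, the sketch below shows the standard classifier-free guidance combination of an unconditional and a conditional noise prediction. It is only the generic guidance rule, not the ProxNPI method itself; the function name and the toy two-element vectors are assumptions for illustration.

```python
def cfg_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: start from the unconditional noise
    prediction and extrapolate toward the conditional one by `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Toy noise predictions (hypothetical values, not model outputs).
eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 3.0]

same_as_conditional = cfg_noise(eps_uncond, eps_cond, 1.0)  # scale 1: no extrapolation
amplified = cfg_noise(eps_uncond, eps_cond, 2.0)            # scale > 1: pushed past eps_cond
```

At a scale of 1 the result equals the conditional prediction. Larger scales move the prediction further from both inputs, which is the regime in which the abstract says DDIM reconstruction accuracy degrades.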
Effect of Temperature on Electromagnetic Performance of Active Phased Array Antenna
Active phased array antennas (APAAs) can suffer from the effects of harsh thermal environments, which are caused by the large quantity of power generated by densely packed T/R modules and by external thermal impacts. The situation may be worse in the case of limited room and severe thermal loads, due to heat radiation and a low temperature sink. As a result, the temperature field of the antenna can change. Since large numbers of temperature-sensitive electronic components exist in T/R modules, the excitation current output can be significantly affected and the electromagnetic performance of APAAs can be seriously degraded. However, due to a lack of quantitative analysis, it is difficult to directly estimate the effect of temperature on the electromagnetic performance of APAAs. Therefore, this study investigated the electromagnetic performance of APAAs as affected by two key factors (the uniformly distributed temperature field and the temperature gradient field) based on different antenna shapes and sizes, to provide theoretical guidance for their thermal design.
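To make the link between excitation current errors and electromagnetic performance concrete, the sketch below evaluates the textbook array factor of a uniform linear array and compares an ideal excitation with a perturbed one. The linear geometry, the half-wavelength spacing, and the phase error values are all assumptions for illustration; the abstract does not state how temperature maps to amplitude or phase errors.

```python
import cmath
import math

def array_factor(amplitudes, phases, theta, spacing_wavelengths=0.5):
    """Array factor of a uniform linear array.

    theta is measured from broadside, in radians. Element n has
    excitation amplitude amplitudes[n] and phase phases[n] (radians).
    """
    kd = 2 * math.pi * spacing_wavelengths  # wavenumber times element spacing
    return sum(
        a * cmath.exp(1j * (n * kd * math.sin(theta) + p))
        for n, (a, p) in enumerate(zip(amplitudes, phases))
    )

n_elements = 8
ideal = abs(array_factor([1.0] * n_elements, [0.0] * n_elements, 0.0))

# Hypothetical phase errors of +/-0.1 rad, standing in for thermal drift.
phase_errors = [0.1 * (-1) ** n for n in range(n_elements)]
perturbed = abs(array_factor([1.0] * n_elements, phase_errors, 0.0))
```

With identical excitations the broadside magnitude equals the number of elements; any phase scatter across the elements lowers it, which is one simple way the degradation described above shows up.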
Strongly Secure Authenticated Key Exchange from Supersingular Isogenies
This paper addresses the open problem raised by Galbraith in 2018, namely, to find new techniques to design and prove the security of supersingular isogeny-based authenticated key exchange (AKE) protocols against the widest possible range of adversarial attacks. Concretely, we present two AKEs based on a double-key PKE in the supersingular isogeny setting that are secure in the sense of CK, one of the strongest security models for AKE. Our contributions are summarised as follows. Firstly, we propose a strongly secure PKE based on the SI-DDH assumption. By applying a modified Fujisaki-Okamoto transformation, we obtain a secure KEM. Secondly, we propose a two-pass AKE, based on the SI-DDH assumption, using this KEM as a building block. Thirdly, we present a modified version of the KEM that is secure against leakage under the 1-Oracle SI-DH assumption. Using the modified KEM as a building block, we then propose a three-pass AKE based on the 1-Oracle SI-DH assumption. Finally, we prove that both AKEs are CK secure in the random oracle model and support arbitrary registration. We also provide an implementation to illustrate the efficiency of our schemes.
Our schemes compare favourably against existing isogeny-based AKEs. To the best of our knowledge, they are the first of their kind to offer security against arbitrary registration, wPFS, KCI and MEX simultaneously. Regarding efficiency, our schemes outperform existing schemes in terms of bandwidth as well as CPU cycle count.